NSF PAR Search | NSF Public Access Repository

Power-aware Deep Learning Model Serving with µ-Serve. In Proceedings of the 2024 USENIX Annual Technical Conference (ATC 2024).

Qiu, H; Mao, W; Patke, A; Cui, S; Jha, S; Wang, C; Franke, H; Kalbarczyk, Z; Basar, T; Iyer, R (September 2024, Usenix_Atc_24)
Begnum, Kyrre; Border, Charles (Ed.)
With the increasing popularity of large deep learning model-serving workloads, there is a pressing need to reduce the energy consumption of a model-serving cluster while still satisfying throughput or model-serving latency requirements. Model multiplexing approaches such as model parallelism, model placement, replication, and batching aim to optimize model-serving performance. However, they fall short of leveraging the GPU frequency scaling opportunity for power saving. In this paper, we demonstrate (1) the benefits of GPU frequency scaling in power saving for model serving; and (2) the necessity for co-design and optimization of fine-grained model multiplexing and GPU frequency scaling. We explore the co-design space and present a novel power-aware model-serving system, μ-Serve. μ-Serve is a model-serving framework that optimizes the power consumption and model-serving latency/throughput of serving multiple ML models efficiently in a homogeneous GPU cluster. Evaluation results on production workloads show that μ-Serve achieves 1.2–2.6× power saving by dynamic GPU frequency scaling (up to 61% reduction) without SLO attainment violations.
Full Text Available
Integrating HPC, AI, and Workflows for Scientific Data Analysis (Dagstuhl Seminar 23352)

Gainaru, A; Jha, S; Kirkpatrick, C; Laney, D; Nagel, W_E; Rybicki, J; Talia, D (March 2024, Dagstuhl Reports (DagRep))

Full Text Available
Spectral dataset of young type Ib supernovae and their time evolution

https://doi.org/10.1051/0004-6361/202452214

Yesmin, N; Pellegrino, C; Modjaz, M; Baer-Way, R; Howell, D A; Arcavi, I; Farah, J; Hiramatsu, D; Hosseinzadeh, G; McCully, C; et al (January 2025, Astronomy & Astrophysics)

Due to high-cadence automated surveys, we can now detect and classify supernovae (SNe) within a few days after explosion, if not earlier. Early-time spectra of young SNe directly probe the outermost layers of the ejecta, providing insights into the extent of stripping in the progenitor star and the explosion mechanism in the case of core-collapse supernovae. However, many SNe show overlapping observational characteristics at early times, complicating the early-time classification. In this paper, we focus on the study and classification of type Ib supernovae (SNe Ib), which are a subclass of core-collapse SNe that lack strong hydrogen lines but show helium lines in their spectra. Here we present a spectral dataset of eight SNe Ib, chosen to have at least three pre-maximum spectra, which we call early spectra. Our dataset was obtained mainly by the Las Cumbres Observatory (LCO) and it consists of a total of 82 optical photospheric spectra, including 38 early spectra. This dataset increases the number of published SNe Ib with at least three early spectra by ∼60%. For our classification efforts, we used early spectra in addition to spectra taken around maximum light. We also converted our spectra into SN IDentification (SNID) templates and make them available to the community for easier identification of young SNe Ib. Our dataset increases the number of publicly available SNID templates of early spectra of SNe Ib by ∼43%. Half of our sample has SN types that change over time or are different from what is listed on the Transient Name Server (TNS). We discuss the implications of our dataset and our findings for current and upcoming SN surveys and their classification efforts.
Full Text Available
When Green Computing Meets Performance and Resilience SLOs.

https://doi.org/10.1109/DSN-S60304.2024

Qiu, H; Mao, W; Wang, C; Jha, S; Franke, H; Narayanaswami, C; Kalbarczyk, ZT; Basar, T; Iyer, R_K (January 2024, Institute of Electrical and Electronics Engineers)
This paper addresses the urgent need to transition to global net-zero carbon emissions by 2050 while retaining the ability to meet joint performance and resilience objectives. The focus is on computing infrastructures, such as hyperscale cloud datacenters, that consume significant power and thus produce increasing amounts of carbon emissions. Our goal is to (1) optimize the usage of green energy sources (e.g., solar energy), which are desirable but expensive and relatively unstable, and (2) continuously reduce the use of fossil fuels, which have a lower cost but a significant negative societal impact. Meanwhile, cloud datacenters strive to meet their customers’ requirements, e.g., service-level objectives (SLOs) in application latency or throughput, which are impacted by infrastructure resilience and availability. We propose a scalable formulation that combines sustainability, cloud resilience, and performance as a joint optimization problem with multiple interdependent objectives to address these issues holistically. Given the complexity and dynamicity of the problem, machine learning (ML) approaches, such as reinforcement learning, are essential for achieving continuous optimization. Our study highlights the challenges of green energy instability, which necessitates innovative ML-centric solutions across heterogeneous infrastructures to manage the transition towards green computing. Underlying the ML-centric solutions must be methods to combine classic system resilience techniques with innovations in real-time ML resilience (not addressed heretofore). We believe that this approach will not only set a new direction in the resilient, SLO-driven adoption of green energy but also enable us to manage future sustainable systems in ways that were not possible before.
Full Text Available
Concept-based explanations for out-of-distribution detectors

Choi, J; Raghuram, J; Feng, R; Chen, J; Jha, S; Prakash, A (July 2023, International Conference on Machine Learning)

Out-of-distribution (OOD) detection plays a crucial role in ensuring the safe deployment of deep neural network (DNN) classifiers. While a myriad of methods have focused on improving the performance of OOD detectors, a critical gap remains in interpreting their decisions. We help bridge this gap by providing explanations for OOD detectors based on learned high-level concepts. We first propose two new metrics for assessing the effectiveness of a particular set of concepts for explaining OOD detectors: 1) detection completeness, which quantifies the sufficiency of concepts for explaining an OOD-detector’s decisions, and 2) concept separability, which captures the distributional separation between in-distribution and OOD data in the concept space. Based on these metrics, we propose an unsupervised framework for learning a set of concepts that satisfy the desired properties of high detection completeness and concept separability, and demonstrate its effectiveness in providing concept-based explanations for diverse off-the-shelf OOD detectors. We also show how to identify prominent concepts contributing to the detection results, and provide further reasoning about their decisions.
Full Text Available
Integrating HPC, AI, and Workflows for Scientific Data Analysis (Dagstuhl Seminar 23352)

Gainaru, A; Jha, S; Kirkpatrick, C; Laney, D; Nagel, W; Rybicki, J; Talia, D (March 2023, Dagstuhl Reports (DagRep))

Full Text Available
Final Moments. III. Explosion Properties and Progenitor Constraints of CSM-interacting Type II Supernovae

https://doi.org/10.3847/1538-4357/adfa23

Jacobson-Galán, W V; Dessart, L; Davis, K W; Bostroem, K A; Kilpatrick, C D; Margutti, R; Filippenko, A V; Foley, R J; Chornock, R; Terreran, G; et al (October 2025, The Astrophysical Journal)

We present analysis of the plateau and late-time phase properties of a sample of 39 Type II supernovae (SNe II) that show narrow, transient, high-ionization emission lines (i.e., “IIn-like”) in their early-time spectra from interaction with confined, dense circumstellar material (CSM). Originally presented by W. V. Jacobson-Galán et al., this sample also includes multicolor light curves and spectra extending to late-time phases of 35 SNe with no evidence for IIn-like features at <2 days after first light. We measure photospheric phase light-curve properties for the distance-corrected sample and find that SNe II with IIn-like features have significantly higher luminosities and decline rates at +50 days than the comparison sample, which could be connected to inflated progenitor radii, lower ejecta mass, and/or persistent CSM interaction. However, we find no statistical evidence that the measured plateau durations and ⁵⁶Ni masses of SNe II with and without IIn-like features arise from different distributions. We estimate progenitor zero-age main-sequence (ZAMS) masses for all SNe with nebular spectroscopy through spectral model comparisons and find that most objects, both with and without IIn-like features, are consistent with progenitor masses ≤12.5 M_⊙. Combining progenitor ZAMS masses with CSM densities inferred from early-time spectra suggests multiple channels for enhanced mass loss in the final years before core collapse, such as a convection-driven chromosphere or binary interaction. Finally, we find spectroscopic evidence for ongoing ejecta-CSM interaction at radii >10¹⁶ cm, consistent with substantial progenitor mass-loss rates of ∼10⁻⁴–10⁻⁵ M_⊙ yr⁻¹ (v_w < 50 km s⁻¹) in the final centuries to millennia before explosion.
Free, publicly-accessible full text available October 8, 2026
Principal Component Flows

Cunningham, E; Cobb, A; Jha, S (April 2022, Proceedings of the 39th International Conference on Machine Learning, PMLR 162:4492-4519)

Normalizing flows map an independent set of latent variables to their samples using a bijective transformation. Despite the exact correspondence between samples and latent variables, their high-level relationship is not well understood. In this paper, we characterize the geometric structure of flows using principal manifolds and understand the relationship between latent variables and samples using contours. We introduce a novel class of normalizing flows, called principal component flows (PCF), whose contours are its principal manifolds, and a variant for injective flows (iPCF) that is more efficient to train than regular injective flows. PCFs can be constructed using any flow architecture, are trained with a regularized maximum likelihood objective, and can perform density estimation on all of their principal manifolds. In our experiments, we show that PCFs and iPCFs are able to learn the principal manifolds over a variety of datasets. Additionally, we show that PCFs can perform density estimation on data that lie on a manifold with variable dimensionality, which is not possible with existing normalizing flows.
Full Text Available
Detecting Out-Of-Context Objects Using Graph Context Reasoning Network

Acharya, M; Roy, A; Koneripalli, K; Jha, S; Kanan, C; Divakaran, A (July 2022, IJCAI)

This paper presents an approach to detect out-of-context (OOC) objects in an image. Given an image with a set of objects, our goal is to determine if an object is inconsistent with the scene context and detect the OOC object with a bounding box. In this work, we consider commonly explored contextual relations such as co-occurrence relations, the relative size of an object with respect to other objects, and the position of the object in the scene. We posit that contextual cues are useful to determine object labels for in-context objects and inconsistent context cues are detrimental to determining object labels for out-of-context objects. To realize this hypothesis, we propose a graph contextual reasoning network (GCRN) to detect OOC objects. GCRN consists of two separate graphs to predict object labels based on the contextual cues in the image: 1) a representation graph to learn object features based on the neighboring objects and 2) a context graph to explicitly capture contextual cues from the neighboring objects. GCRN explicitly captures the contextual cues to improve the detection of in-context objects and identify objects that violate contextual relations. To evaluate our approach, we create a large-scale dataset by adding OOC object instances to the COCO images. We also evaluate on the recent OCD benchmark. Our results show that GCRN outperforms competitive baselines in detecting OOC objects and correctly detecting in-context objects.
Full Text Available
